312 ◾ Bioinformatics
“kaiju_db_refseq_xxxx-xx-xx.tgz”. To classify the short reads in our FASTQ files, you need
to run the following:
mkdir kaiju_output
kaiju -t kaijudb/nodes.dmp \
-f kaijudb/kaiju_db_refseq.fmi \
-i fastq_pure/ERR1823587_pure_R1-80.fastq.gz \
-j fastq_pure/ERR1823587_pure_R2-80.fastq.gz \
-o kaiju_output/ERR1823587.out \
-a greedy \
-z 4 -v
kaiju -t kaijudb/nodes.dmp \
-f kaijudb/kaiju_db_refseq.fmi \
-i fastq_pure/ERR1823601_pure_R1-80.fastq.gz \
-j fastq_pure/ERR1823601_pure_R2-80.fastq.gz \
-o kaiju_output/ERR1823601.out \
-a greedy \
-z 4 -v
kaiju -t kaijudb/nodes.dmp \
-f kaijudb/kaiju_db_refseq.fmi \
-i fastq_pure/ERR1823608_pure_R1-80.fastq.gz \
-j fastq_pure/ERR1823608_pure_R2-80.fastq.gz \
-o kaiju_output/ERR1823608.out \
-a greedy \
-z 4 -v
To learn more about these options, run “kaiju” without any arguments to print its usage
message. The indexing and classification require around 128 GB of RAM, so we do not
recommend using kaiju unless you have enough memory and storage space.
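The three kaiju commands above differ only in the sample accession, so they could also be written as a loop. The following is a minimal sketch, not part of the original workflow: the helper function “run” and the variables “DRY_RUN” and “samples” are hypothetical names introduced here for illustration, while the paths and kaiju options are copied from the commands above. By default the sketch only prints each command so that it can be checked first.

```shell
#!/usr/bin/env bash
# Sketch: loop over the sample accessions instead of repeating the kaiju command.
# DRY_RUN, samples, and run() are illustrative names, not part of kaiju.
# With DRY_RUN=1 (the default) commands are printed, not executed;
# set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"
samples="ERR1823587 ERR1823601 ERR1823608"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

run mkdir -p kaiju_output
for s in $samples; do
    run kaiju -t kaijudb/nodes.dmp \
        -f kaijudb/kaiju_db_refseq.fmi \
        -i "fastq_pure/${s}_pure_R1-80.fastq.gz" \
        -j "fastq_pure/${s}_pure_R2-80.fastq.gz" \
        -o "kaiju_output/${s}.out" \
        -a greedy \
        -z 4 -v
done
```

Once the printed commands look right, rerunning the script with DRY_RUN=0 executes them one sample at a time. Adding another sample then only requires appending its accession to “samples”.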
After running the program successfully, you will need to convert each kaiju output file
into a summary table using the “kaiju2table” command as follows:
kaiju2table -t kaijudb/nodes.dmp \
-n kaijudb/names.dmp \
-r taxonomic_level \
-o kaiju_output/ERR1823587_table.tsv \
kaiju_output/ERR1823587.out \
-l taxonomic,levels,separated,by,commas
kaiju2table -t kaijudb/nodes.dmp \
-n kaijudb/names.dmp \
-r taxonomic_level \
-o kaiju_output/ERR1823601_table.tsv \
kaiju_output/ERR1823601.out \
-l taxonomic,levels,separated,by,commas
kaiju2table -t kaijudb/nodes.dmp \
-n kaijudb/names.dmp \